Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 248258 |
| Missing cells | 40673 |
| Missing cells (%) | 1.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 26.5 MiB |
| Average record size in memory | 112.0 B |
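The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is the number of missing cells divided by rows times columns, and that total memory is the average record size times the number of rows. The variable names are illustrative and not part of the report.

```python
# Cross-check the derived dataset statistics from the raw counts reported above.
n_variables = 14
n_observations = 248258
missing_cells = 40673
avg_record_bytes = 112.0  # average record size in memory, as reported

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory = rows x average record size, shown in MiB (2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under these assumptions the results round to the 1.2% and 26.5 MiB shown in the table.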
Variable types
| NUM | 9 |
|---|---|
| CAT | 3 |
| DATE | 1 |
| BOOL | 1 |
VERSIE has constant value "1" | Constant |
DATUM_BESTAND has constant value "2020-07-21" | Constant |
PEILDATUM has constant value "2020-07-01" | Constant |
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1766 distinct values | High cardinality |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
GEMIDDELDE_VERKOOPPRIJS has 40673 (16.4%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.19022168) | Skewed |
Reproduction
| Analysis started | 2020-09-06 22:36:21.016456 |
|---|---|
| Analysis finished | 2020-09-06 22:36:49.545942 |
| Duration | 28.53 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
VERSIE
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 248258 | 100.0% |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2020-07-21 | 248258 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2020-07-01 | 248258 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
JAAR
Date
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2020-01-01 00:00:00 |
Histogram with fixed size bins (bins=9)
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 423.6569899 |
|---|---|
| Minimum | 301 |
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
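The last two rows of the quantile table are derived from the rows above them. As a small sketch using the values reported for BEHANDELEND_SPECIALISME_CD (the variable names are illustrative):

```python
# Quantiles reported above for BEHANDELEND_SPECIALISME_CD.
minimum, q1, median, q3, maximum = 301, 305, 313, 322, 8418

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50% of the data

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```

The range is far larger than the IQR here because a few codes lie far above the bulk of the values. Since this column holds specialism codes rather than measurements, these figures describe the code numbering, not a quantity.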
Descriptive statistics
| Standard deviation | 929.0263119 |
|---|---|
| Coefficient of variation (CV) | 2.192873797 |
| Kurtosis | 69.91205688 |
| Mean | 423.6569899 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.473463086 |
| Sum | 105176237 |
| Variance | 863089.8881 |
| Monotonicity | Not monotonic |
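Several descriptive statistics are simple functions of one another. The sketch below recomputes the coefficient of variation and the variance from the reported mean and standard deviation, and illustrates the median absolute deviation on a hypothetical toy sample. The helper `mad` and the sample values are assumptions made for illustration; they are not taken from the report's implementation.

```python
import statistics

# Values reported above for BEHANDELEND_SPECIALISME_CD.
mean = 423.6569899
std = 929.0263119

cv = std / mean        # coefficient of variation: standard deviation relative to the mean
variance = std ** 2    # variance is the squared standard deviation

print(f"CV = {cv:.6f}")
print(f"Variance = {variance:.2f}")


def mad(values):
    """Median absolute deviation: the median of |x - median(x)|."""
    centre = statistics.median(values)
    return statistics.median(abs(x - centre) for x in values)


# Hypothetical toy sample with one extreme value; not the real column.
sample = [301, 303, 305, 313, 322, 8418]
print(f"MAD of toy sample = {mad(sample)}")
```

Because it is built on medians, the MAD is barely affected by the single extreme value in the toy sample, whereas the standard deviation would be dominated by it.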
Histogram with fixed size bins (bins=27)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 305 | 35119 | 14.1% | |
| 313 | 32268 | 13.0% | |
| 303 | 28592 | 11.5% | |
| 330 | 19976 | 8.0% | |
| 316 | 16903 | 6.8% | |
| 308 | 12492 | 5.0% | |
| 324 | 10303 | 4.2% | |
| 306 | 10283 | 4.1% | |
| 301 | 10130 | 4.1% | |
| 304 | 8132 | 3.3% | |
| Other values (17) | 64060 | 25.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 301 | 10130 | 4.1% | |
| 302 | 5384 | 2.2% | |
| 303 | 28592 | 11.5% | |
| 304 | 8132 | 3.3% | |
| 305 | 35119 | 14.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8418 | 3301 | 1.3% | |
| 1900 | 162 | 0.1% | |
| 390 | 616 | 0.2% | |
| 389 | 2709 | 1.1% | |
| 362 | 3826 | 1.5% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1766 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 101 | 1045 | 0.4% | |
| 402 | 1025 | 0.4% | |
| 403 | 994 | 0.4% | |
| 301 | 993 | 0.4% | |
| 201 | 940 | 0.4% | |
| 203 | 936 | 0.4% | |
| 401 | 843 | 0.3% | |
| 404 | 830 | 0.3% | |
| 409 | 812 | 0.3% | |
| 802 | 803 | 0.3% | |
| Other values (1756) | 239037 | 96.3% |
Frequencies of value counts
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.347960589 |
| Min length | 2 |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct | 5886 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 438593518.7 |
|---|---|
| Minimum | 10501002 |
| Maximum | 998418081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999036 |
| Q1 | 99799023 |
| median | 149599012 |
| Q3 | 990004004 |
| 95-th percentile | 990416042 |
| Maximum | 998418081 |
| Range | 987917079 |
| Interquartile range (IQR) | 890204981 |
Descriptive statistics
| Standard deviation | 428548956.9 |
|---|---|
| Coefficient of variation (CV) | 0.9770982438 |
| Kurtosis | -1.726751426 |
| Mean | 438593518.7 |
| Median Absolute Deviation (MAD) | 119599999 |
| Skewness | 0.478481953 |
| Sum | 1.088843498e+14 |
| Variance | 1.836542084e+17 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 990004009 | 1844 | 0.7% | |
| 990004007 | 1798 | 0.7% | |
| 990003004 | 1758 | 0.7% | |
| 990004006 | 1416 | 0.6% | |
| 990356076 | 1246 | 0.5% | |
| 990356073 | 1151 | 0.5% | |
| 990003007 | 1142 | 0.5% | |
| 131999228 | 1128 | 0.5% | |
| 131999164 | 1106 | 0.4% | |
| 199299013 | 1047 | 0.4% | |
| Other values (5876) | 234622 | 94.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10501002 | 6 | < 0.1% | |
| 10501003 | 9 | < 0.1% | |
| 10501004 | 9 | < 0.1% | |
| 10501005 | 9 | < 0.1% | |
| 10501007 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 998418081 | 115 | < 0.1% | |
| 998418080 | 100 | < 0.1% | |
| 998418079 | 27 | < 0.1% | |
| 998418077 | 6 | < 0.1% | |
| 998418076 | 6 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
| Distinct | 8621 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 493.6843123 |
|---|---|
| Minimum | 1 |
| Maximum | 153137 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 13 |
| Q3 | 96 |
| 95-th percentile | 1648 |
| Maximum | 153137 |
| Range | 153136 |
| Interquartile range (IQR) | 94 |
Descriptive statistics
| Standard deviation | 3086.960776 |
|---|---|
| Coefficient of variation (CV) | 6.252904334 |
| Kurtosis | 390.858878 |
| Mean | 493.6843123 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 16.55051804 |
| Sum | 122561080 |
| Variance | 9529326.831 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41787 | 16.8% | |
| 2 | 20547 | 8.3% | |
| 3 | 13214 | 5.3% | |
| 4 | 9830 | 4.0% | |
| 5 | 7613 | 3.1% | |
| 6 | 6396 | 2.6% | |
| 7 | 5357 | 2.2% | |
| 8 | 4434 | 1.8% | |
| 9 | 4062 | 1.6% | |
| 10 | 3647 | 1.5% | |
| Other values (8611) | 131371 | 52.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41787 | 16.8% | |
| 2 | 20547 | 8.3% | |
| 3 | 13214 | 5.3% | |
| 4 | 9830 | 4.0% | |
| 5 | 7613 | 3.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 153137 | 1 | < 0.1% | |
| 152973 | 1 | < 0.1% | |
| 144742 | 1 | < 0.1% | |
| 133196 | 1 | < 0.1% | |
| 112286 | 1 | < 0.1% |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
| Distinct | 9207 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 575.5768475 |
|---|---|
| Minimum | 1 |
| Maximum | 239907 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 105 |
| 95-th percentile | 1860 |
| Maximum | 239907 |
| Range | 239906 |
| Interquartile range (IQR) | 102 |
Descriptive statistics
| Standard deviation | 3891.722316 |
|---|---|
| Coefficient of variation (CV) | 6.761429569 |
| Kurtosis | 720.4087631 |
| Mean | 575.5768475 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 21.19022168 |
| Sum | 142891557 |
| Variance | 15145502.58 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 40331 | 16.2% | |
| 2 | 20200 | 8.1% | |
| 3 | 13111 | 5.3% | |
| 4 | 9661 | 3.9% | |
| 5 | 7556 | 3.0% | |
| 6 | 6395 | 2.6% | |
| 7 | 5328 | 2.1% | |
| 8 | 4426 | 1.8% | |
| 9 | 3986 | 1.6% | |
| 10 | 3653 | 1.5% | |
| Other values (9197) | 133611 | 53.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 40331 | 16.2% | |
| 2 | 20200 | 8.1% | |
| 3 | 13111 | 5.3% | |
| 4 | 9661 | 3.9% | |
| 5 | 7556 | 3.0% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 239907 | 1 | < 0.1% | |
| 232508 | 1 | < 0.1% | |
| 231005 | 1 | < 0.1% | |
| 227757 | 1 | < 0.1% | |
| 219697 | 1 | < 0.1% |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
| Distinct | 7490 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7471.30391 |
|---|---|
| Minimum | 1 |
| Maximum | 210005 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 35 |
| Q1 | 367 |
| median | 1608 |
| Q3 | 6094 |
| 95-th percentile | 36157 |
| Maximum | 210005 |
| Range | 210004 |
| Interquartile range (IQR) | 5727 |
Descriptive statistics
| Standard deviation | 17518.84739 |
|---|---|
| Coefficient of variation (CV) | 2.344817934 |
| Kurtosis | 32.78573724 |
| Mean | 7471.30391 |
| Median Absolute Deviation (MAD) | 1477 |
| Skewness | 5.010108012 |
| Sum | 1854810966 |
| Variance | 306910014 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21 | 452 | 0.2% | |
| 9 | 438 | 0.2% | |
| 4 | 433 | 0.2% | |
| 20 | 415 | 0.2% | |
| 17 | 414 | 0.2% | |
| 14 | 404 | 0.2% | |
| 2 | 403 | 0.2% | |
| 27 | 394 | 0.2% | |
| 8 | 391 | 0.2% | |
| 19 | 388 | 0.2% | |
| Other values (7480) | 244126 | 98.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 342 | 0.1% | |
| 2 | 403 | 0.2% | |
| 3 | 331 | 0.1% | |
| 4 | 433 | 0.2% | |
| 5 | 345 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 210005 | 25 | < 0.1% | |
| 209295 | 19 | < 0.1% | |
| 205192 | 17 | < 0.1% | |
| 202588 | 17 | < 0.1% | |
| 200190 | 16 | < 0.1% |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
| Distinct | 8257 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10488.71636 |
|---|---|
| Minimum | 1 |
| Maximum | 340654 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 43 |
| Q1 | 471 |
| median | 2184 |
| Q3 | 8495 |
| 95-th percentile | 50492 |
| Maximum | 340654 |
| Range | 340653 |
| Interquartile range (IQR) | 8024 |
Descriptive statistics
| Standard deviation | 25354.07365 |
|---|---|
| Coefficient of variation (CV) | 2.417271358 |
| Kurtosis | 36.96084773 |
| Mean | 10488.71636 |
| Median Absolute Deviation (MAD) | 2021 |
| Skewness | 5.281167133 |
| Sum | 2603907747 |
| Variance | 642829050.4 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 46 | 370 | 0.1% | |
| 4 | 363 | 0.1% | |
| 19 | 348 | 0.1% | |
| 2 | 341 | 0.1% | |
| 7 | 340 | 0.1% | |
| 18 | 339 | 0.1% | |
| 13 | 337 | 0.1% | |
| 15 | 336 | 0.1% | |
| 11 | 331 | 0.1% | |
| 34 | 329 | 0.1% | |
| Other values (8247) | 244824 | 98.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 298 | 0.1% | |
| 2 | 341 | 0.1% | |
| 3 | 315 | 0.1% | |
| 4 | 363 | 0.1% | |
| 5 | 313 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 340654 | 25 | < 0.1% | |
| 338481 | 19 | < 0.1% | |
| 323773 | 20 | < 0.1% | |
| 300764 | 17 | < 0.1% | |
| 294010 | 17 | < 0.1% |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
| Distinct | 241 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 654642.3406 |
|---|---|
| Minimum | 49 |
| Maximum | 1489781 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 49 |
|---|---|
| 5-th percentile | 37910 |
| Q1 | 246331 |
| median | 744820 |
| Q3 | 995699 |
| 95-th percentile | 1340532 |
| Maximum | 1489781 |
| Range | 1489732 |
| Interquartile range (IQR) | 749368 |
Descriptive statistics
| Standard deviation | 427431.7618 |
|---|---|
| Coefficient of variation (CV) | 0.6529241012 |
| Kurtosis | -1.147838894 |
| Mean | 654642.3406 |
| Median Absolute Deviation (MAD) | 321866 |
| Skewness | 0.02596502876 |
| Sum | 1.625201982e+11 |
| Variance | 1.82697911e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 881057 | 5102 | 2.1% | |
| 874401 | 4355 | 1.8% | |
| 843847 | 4348 | 1.8% | |
| 889324 | 4325 | 1.7% | |
| 865835 | 4263 | 1.7% | |
| 737536 | 4020 | 1.6% | |
| 1083151 | 3891 | 1.6% | |
| 1066686 | 3851 | 1.6% | |
| 1069100 | 3841 | 1.5% | |
| 1040555 | 3810 | 1.5% | |
| Other values (231) | 206452 | 83.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 49 | 8 | < 0.1% | |
| 140 | 40 | < 0.1% | |
| 296 | 87 | < 0.1% | |
| 889 | 54 | < 0.1% | |
| 1069 | 100 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1489781 | 2976 | 1.2% | |
| 1450955 | 3054 | 1.2% | |
| 1422234 | 3564 | 1.4% | |
| 1340532 | 3541 | 1.4% | |
| 1333856 | 3547 | 1.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
| Distinct | 242 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1032064.413 |
|---|---|
| Minimum | 49 |
| Maximum | 2558785 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 49 |
|---|---|
| 5-th percentile | 41243 |
| Q1 | 351258 |
| median | 1010488 |
| Q3 | 1727801 |
| 95-th percentile | 2187018 |
| Maximum | 2558785 |
| Range | 2558736 |
| Interquartile range (IQR) | 1376543 |
Descriptive statistics
| Standard deviation | 727817.7627 |
|---|---|
| Coefficient of variation (CV) | 0.705205754 |
| Kurtosis | -0.9645056149 |
| Mean | 1032064.413 |
| Median Absolute Deviation (MAD) | 670622 |
| Skewness | 0.2829950506 |
| Sum | 2.562182471e+11 |
| Variance | 5.297186956e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1211801 | 5102 | 2.1% | |
| 1281531 | 4355 | 1.8% | |
| 1215953 | 4348 | 1.8% | |
| 1305530 | 4325 | 1.7% | |
| 1273792 | 4263 | 1.7% | |
| 1053212 | 4020 | 1.6% | |
| 2553407 | 3891 | 1.6% | |
| 2496304 | 3851 | 1.6% | |
| 2558785 | 3841 | 1.5% | |
| 2068196 | 3810 | 1.5% | |
| Other values (232) | 206452 | 83.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 49 | 8 | < 0.1% | |
| 140 | 40 | < 0.1% | |
| 296 | 9 | < 0.1% | |
| 302 | 78 | < 0.1% | |
| 895 | 54 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2558785 | 3841 | 1.5% | |
| 2553407 | 3891 | 1.6% | |
| 2496304 | 3851 | 1.6% | |
| 2187018 | 3757 | 1.5% | |
| 2068196 | 3810 | 1.5% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct | 3061 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 40673 |
| Missing (%) | 16.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3465.36703 |
|---|---|
| Minimum | 70 |
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 455 |
| median | 1210 |
| Q3 | 3935 |
| 95-th percentile | 13130 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3480 |
Descriptive statistics
| Standard deviation | 6601.384753 |
|---|---|
| Coefficient of variation (CV) | 1.904959762 |
| Kurtosis | 177.7395121 |
| Mean | 3465.36703 |
| Median Absolute Deviation (MAD) | 985 |
| Skewness | 8.099765665 |
| Sum | 719358215 |
| Variance | 43578280.66 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 160 | 1804 | 0.7% | |
| 110 | 1667 | 0.7% | |
| 105 | 1585 | 0.6% | |
| 180 | 1400 | 0.6% | |
| 300 | 1228 | 0.5% | |
| 120 | 1225 | 0.5% | |
| 145 | 1198 | 0.5% | |
| 140 | 1197 | 0.5% | |
| 500 | 1129 | 0.5% | |
| 310 | 1106 | 0.4% | |
| Other values (3051) | 194046 | 78.2% | |
| (Missing) | 40673 | 16.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 70 | 226 | 0.1% | |
| 75 | 75 | < 0.1% | |
| 80 | 360 | 0.1% | |
| 85 | 852 | 0.3% | |
| 90 | 541 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 287220 | 8 | < 0.1% | |
| 148910 | 3 | < 0.1% | |
| 142855 | 4 | < 0.1% | |
| 122155 | 4 | < 0.1% | |
| 116765 | 3 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r (as long as the sign of the slope stays the same). To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
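The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` is an assumption. The inputs are the AANTAL_PAT_PER_ZPD and AANTAL_SUBTRAJECT_PER_ZPD values from the first ten rows of this dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The common 1/n (or 1/(n-1)) factor cancels between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# First ten rows of the dataset (see "First rows").
patients = [1, 7, 2, 70, 455, 1, 3, 877, 277, 1]
subtrajects = [1, 7, 2, 72, 465, 1, 3, 918, 291, 1]

r = pearson_r(patients, subtrajects)
print(f"r = {r:.4f}")

# Shifting and rescaling one variable leaves r unchanged.
rescaled = [3 * s + 10 for s in subtrajects]
print(f"r after rescaling = {pearson_r(patients, rescaled):.4f}")
```

On these ten rows r is close to +1, which is consistent with the high-correlation warning for this pair of columns.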
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
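As a sketch of that definition, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names below are illustrative assumptions, and ties are handled by giving tied values the mean of their rank positions, which is a common convention rather than something stated in this report.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(f"Spearman rho = {spearman_rho(x, y):.4f}")
print(f"Pearson r    = {pearson_r(x, y):.4f}")
```

For the cubic example the ranks of x and y are identical, so ρ reaches +1 while Pearson's r stays below 1.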
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
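The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. The function name is an illustrative assumption. Library implementations often apply a tie correction (tau-b), so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
print(kendall_tau_a(x, [1, 2, 3, 4]))  # identical ordering
print(kendall_tau_a(x, [4, 3, 2, 1]))  # fully reversed ordering
print(kendall_tau_a(x, [1, 3, 2, 4]))  # one swapped pair out of six
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of this size.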
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 979001188 | 1 | 1 | 2117 | 2547 | 1021797 | 1642254 | 28685.0 |
| 1 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899051 | 7 | 7 | 2117 | 2547 | 1021797 | 1642254 | 1005.0 |
| 2 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 979001223 | 2 | 2 | 2117 | 2547 | 1021797 | 1642254 | 6910.0 |
| 3 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899056 | 70 | 72 | 2117 | 2547 | 1021797 | 1642254 | 5380.0 |
| 4 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899026 | 455 | 465 | 2117 | 2547 | 1021797 | 1642254 | 210.0 |
| 5 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 979001220 | 1 | 1 | 2117 | 2547 | 1021797 | 1642254 | 4860.0 |
| 6 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899052 | 3 | 3 | 2117 | 2547 | 1021797 | 1642254 | NaN |
| 7 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899012 | 877 | 918 | 2117 | 2547 | 1021797 | 1642254 | 480.0 |
| 8 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 99899028 | 277 | 291 | 2117 | 2547 | 1021797 | 1642254 | 6525.0 |
| 9 | 1.0 | 2020-07-21 | 2020-07-01 | 2013-01-01 | 320 | 709 | 979001219 | 1 | 1 | 2117 | 2547 | 1021797 | 1642254 | 6540.0 |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 248248 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027136 | 1 | 1 | 2932 | 5560 | 186223 | 340829 | 26050.0 |
| 248249 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027146 | 88 | 90 | 2932 | 5560 | 186223 | 340829 | 23880.0 |
| 248250 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027147 | 13 | 13 | 2932 | 5560 | 186223 | 340829 | NaN |
| 248251 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027131 | 84 | 92 | 2932 | 5560 | 186223 | 340829 | 165.0 |
| 248252 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027135 | 1 | 1 | 2932 | 5560 | 186223 | 340829 | 40575.0 |
| 248253 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027144 | 10 | 10 | 2932 | 5560 | 186223 | 340829 | NaN |
| 248254 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027150 | 41 | 44 | 2932 | 5560 | 186223 | 340829 | NaN |
| 248255 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027199 | 326 | 358 | 2932 | 5560 | 186223 | 340829 | 850.0 |
| 248256 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027151 | 477 | 604 | 2932 | 5560 | 186223 | 340829 | 3495.0 |
| 248257 | 1.0 | 2020-07-21 | 2020-07-01 | 2018-01-01 | 327 | 0216 | 990027198 | 2532 | 4263 | 2932 | 5560 | 186223 | 340829 | 220.0 |